Unsupervised Induction of Modern Standard Arabic Verb Classes Using Syntactic Frames and LSA

Authors

  • Neal Snider
  • Mona T. Diab
Abstract

We exploit the resources in the Arabic Treebank (ATB) and Arabic Gigaword (AG) to determine the best features for the novel task of automatically creating lexical semantic verb classes for Modern Standard Arabic (MSA). The verbs are classified into groups that share semantic elements of meaning as they exhibit similar syntactic behavior. The results of the clustering experiments are compared with a gold standard set of classes, which is approximated by using the noisy English translations provided in the ATB to create Levin-like classes for MSA. The quality of the clusters is found to be sensitive to the inclusion of syntactic frames, LSA vectors, morphological pattern, and subject animacy. The best set of parameters yields an F(β=1) score of 0.456, compared to a random-baseline F(β=1) score of 0.205.

Similar resources

Unsupervised Induction of Modern Standard Arabic Verb Classes

We exploit the resources in the Arabic Treebank (ATB) for the novel task of automatically creating lexical semantic verb classes for Modern Standard Arabic (MSA). Verbs are clustered into groups that share semantic elements of meaning as they exhibit similar syntactic behavior. The results of the clustering experiments are compared with a gold standard set of classes, which is approximated by u...

Unsupervised Induction of Modern Standard Arabic Verb Classes and Alternations

Verbs (in lemma form) and syntactic frames are automatically extracted from the ATB. In order to acquire an argument structure for the verbs, I only considered structure that is internal to the maximal Verb Phrase (VP) projection of the verb. However, within the VP, all sisters of the verb are excluded except for those in a close semantic relationship to the verb. This is facilitated by the f...

A Step-wise Usage-based Method for Inducing Polysemy-aware Verb Classes

We present an unsupervised method for inducing verb classes from verb uses in gigaword corpora. Our method consists of two clustering steps: verb-specific semantic frames are first induced by clustering verb uses in a corpus and then verb classes are induced by clustering these frames. By taking this step-wise approach, we can not only generate verb classes based on a massive amount of verb use...

Inducing German Semantic Verb Classes from Purely Syntactic Subcategorisation Information

The paper describes the application of k-Means, a standard clustering technique, to the task of inducing semantic classes for German verbs. Using probability distributions over verb subcategorisation frames, we obtained an intuitively plausible clustering of 57 verbs into 14 classes. The automatic clustering was evaluated against independently motivated, hand-constructed semantic verb classes. A ...

A Large Coverage Verb Taxonomy for Arabic

In this article I present a lexicon for Arabic verbs which exploits Levin’s verb classes (Levin, 1993) and the basic development procedure used by Schuler (2005). The verb lexicon in its current state has 173 classes which contain 4392 verbs and 498 frames providing information about verb root, the deverbal form of the verb, the participle, thematic roles, subcategorisation frames and syntacti...

Publication date: 2006